Abstract

This colloquium explores the fundamental relationship between convex optimization and machine learning, demonstrating why convex optimization serves as the mathematical foundation for many successful machine learning algorithms. Unlike general nonconvex problems, which are typically intractable to solve to global optimality, convex optimization problems possess the remarkable property that any local minimum is guaranteed to be a global minimum, enabling reliable and efficient algorithmic solutions.
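For reference, the property invoked here can be stated in one line (standard definitions, assumed rather than quoted from the talk): a minimization problem is convex when the feasible set \(C\) is convex and the objective \(f\) satisfies

\[
f(\theta x + (1-\theta) y) \;\le\; \theta f(x) + (1-\theta) f(y)
\quad \text{for all } x, y \in C,\ \theta \in [0,1],
\]

and under this definition every local minimizer of \(f\) over \(C\) is also a global minimizer.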

The presentation establishes that machine learning problems are, at their core, optimization problems in which we seek to minimize loss functions subject to various constraints. Through detailed mathematical formulations, we examine how classical machine learning methods, including linear regression, ridge regression, the LASSO, and support vector machines, can all be cast as convex optimization problems. This framework provides not only theoretical guarantees of global optimality but also practical computational advantages through well-established algorithms such as interior-point methods and gradient-descent variants.
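As a concrete illustration of the kind of convex formulation discussed (the notation \(X\), \(y\), \(w\), \(\lambda\) is assumed here, not fixed by the abstract), ridge regression and the LASSO take the forms

\[
\min_{w}\; \tfrac{1}{2}\|Xw - y\|_2^2 + \lambda \|w\|_2^2
\qquad\text{and}\qquad
\min_{w}\; \tfrac{1}{2}\|Xw - y\|_2^2 + \lambda \|w\|_1,
\]

where \(X\) is the design matrix, \(y\) the targets, \(w\) the weights, and \(\lambda \ge 0\) the regularization strength; both objectives are convex in \(w\).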

A significant portion of the discussion focuses on duality theory, which provides powerful tools for both algorithm development and certifying optimality. The Lagrangian dual problem, which is convex regardless of whether the primal problem is, yields lower bounds on the optimal value and enables primal and dual problems to be solved together. The Karush-Kuhn-Tucker (KKT) conditions emerge as optimality conditions that are both necessary and sufficient for convex problems satisfying mild regularity conditions (e.g., Slater’s condition), generalizing the simple gradient-equals-zero condition of unconstrained optimization.
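In symbols (standard notation, assumed here rather than taken verbatim from the presentation), for a problem \(\min_x f_0(x)\) subject to \(f_i(x) \le 0\) and \(h_j(x) = 0\), the Lagrangian is

\[
L(x, \lambda, \nu) \;=\; f_0(x) + \sum_i \lambda_i f_i(x) + \sum_j \nu_j h_j(x),
\]

the dual function \(g(\lambda, \nu) = \inf_x L(x, \lambda, \nu)\) is always concave, and \(g(\lambda, \nu) \le p^\star\) whenever \(\lambda \succeq 0\), which is the lower-bound property mentioned above. The KKT conditions combine primal feasibility, dual feasibility \(\lambda \succeq 0\), complementary slackness \(\lambda_i f_i(x) = 0\), and stationarity of the Lagrangian in \(x\).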

The presentation then examines machine learning from multiple perspectives: statistical (maximum likelihood estimation and KL-divergence minimization), algorithmic (stochastic gradient descent and adaptive methods), and computer-scientific. These diverse perspectives reveal how convex optimization principles unify seemingly different approaches, from statistical inference to practical algorithm implementation, while highlighting the importance of regularization techniques for preventing overfitting.
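A minimal sketch of how the statistical and algorithmic viewpoints meet in practice: maximizing the logistic-regression likelihood is equivalent to minimizing a convex loss, which stochastic gradient descent can attack mini-batch by mini-batch. The code below is an illustrative assumption (NumPy, synthetic data, arbitrary hyperparameters), not material from the colloquium itself.

```python
# Sketch: SGD on the convex, L2-regularized negative log-likelihood of
# logistic regression. Names and hyperparameters are illustrative only.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic binary classification data.
n, d = 1000, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = (rng.uniform(size=n) < 1.0 / (1.0 + np.exp(-X @ w_true))).astype(float)

def sgd_logistic(X, y, lr=0.1, epochs=20, batch_size=32, l2=1e-3):
    """Stochastic gradient descent on the regularized logistic loss (convex)."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        perm = rng.permutation(n)
        for start in range(0, n, batch_size):
            idx = perm[start:start + batch_size]
            Xb, yb = X[idx], y[idx]
            p = 1.0 / (1.0 + np.exp(-Xb @ w))            # predicted probabilities
            grad = Xb.T @ (p - yb) / len(idx) + l2 * w   # mini-batch gradient
            w -= lr * grad
    return w

w_hat = sgd_logistic(X, y)
print("estimated weights:", np.round(w_hat, 2))
```

Because the objective is convex, this simple iteration converges toward the global minimizer (the regularized maximum-likelihood estimate) rather than getting trapped in spurious local minima.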

The colloquium concludes by emphasizing that mastery of convex optimization is essential for applied scientists working in AI and machine learning. The mathematical rigor of convex optimization provides both theoretical understanding and practical tools necessary for developing robust, scalable machine learning systems. This foundation enables practitioners to move beyond heuristic approaches toward principled solutions with provable guarantees, ultimately advancing the field’s scientific maturity.